A classical model for social-influence-driven opinion change is the threshold model. Here we study cascades
of opinion change driven by threshold model dynamics in the case where multiple initiators trigger the
cascade, and where all nodes possess the same adoption threshold φ. Specifically, using empirical and
stylized models of social networks, we study cascade size as a function of the initiator fraction p. We find that
even for arbitrarily high values of φ, there exists a critical initiator fraction p_c(φ) beyond which the cascade
becomes global. Network structure, in particular clustering, plays a significant role in this scenario. Similarly
to the case of single-node or single-clique initiators studied previously, we observe that community structure
within the network facilitates opinion spread to a larger extent than a homogeneous random network.
Finally, we study the efficacy of different initiator selection strategies on the size of the cascade and the
cascade window.

It has long been known through empirical studies that in a population of socially interacting individuals where
each individual node holds an opinion from a binary set, a small fraction of initiators holding the opinion opposite
to the one held by the majority can trigger large cascades and eventually result in a dominant majority holding
the initiators' opinion. Some recent studies have investigated such phenomena in the context of the adoption of
scientific, cultural, and commercial products [1,2]. One of the simplest models that captures adoption dynamics,
irrespective of context, is the threshold model [3-6]. According to the threshold model, an individual changes its
opinion only if a critical fraction of its neighbors have already adopted the new opinion. This required fraction of
new adoptees in the neighborhood is designated the adoption threshold [3,7]. Here, we denote the adoption threshold
by φ. Since its introduction [3], the threshold model has been studied extensively on complex networks to analyze the
conditions under which a vanishingly small fraction (of the total system size) of initiators is capable of triggering a
cascade of opinion change [4,6,8]. In particular, these studies considered initial conditions with a single "active" node [4]
or an active connected clique (a single node and all of its neighbors) [6] as initiators. In this scenario, the condition
for global cascades in connected sparse random networks is φ < 1/⟨k⟩ [4,6,8], where ⟨k⟩ is the average degree of the
network. However, with a few exceptions [9-11], little attention has been paid to the question of how the size and the
selection of this initiator fraction affect the spreading of an opinion in the network, in particular in the regime
where a single active node or a small clique is insufficient to trigger global cascades.

In the case of multiple initiators, how to select them from among the nodes of the network so as to
maximize the spread (cascade size) remains an open question. To address this issue we compare three different
heuristic ways of selecting a set of initiators of predefined size on Erdős-Rényi (ER) random networks [12].
Specifically, we look at the size of the spread over a range of the average degree ⟨k⟩ of the ER networks.
As found earlier for the case of cascades triggered by single initiators [4,8], we find that when the average degree is too
low or too high, large cascades are not triggered. However, within an intermediate range of ⟨k⟩, large cascades are
realized. This range is referred to as the cascade window [5]. We find that the width of this cascade window is the
largest when the initiator nodes are selected successively in descending order of degree, starting with the node
having the largest degree. We also find that the total time taken for the cascade to terminate is shortest for this
selection strategy.

In both ER [4] and empirical [6] networks it was observed that for a given ⟨k⟩, there is a critical threshold φ_c such that
cascades are only triggered if φ < φ_c for a single-node or a very small initiator set [4,6]. Here, we systematically study
the effect of varying the initiator fraction p with ⟨k⟩ held fixed, for the entire range of values of the adoption
threshold φ. We find that for any given threshold φ < 1 there exists a critical value of the fraction of initiators, p_c,
above which global cascades can be triggered. We discuss the dependence of p_c on φ, which turns out to be a
smooth curve separating the two phases, one in which cascades are observed and the other where cascades cannot
be triggered. This finding constitutes an important insight into how
local neighborhood-level thresholds can constrain the emergence of
tipping points for cascades on global scales on sparse graphs. We
note that in Refs. 9,10, the authors went beyond basic heuristic selections
of the initiators (targets) by employing a systematic greedy
selection and a scalable influence maximization algorithm, respectively;
however, they did not explore the region for global [O(N)]
cascades (and the corresponding tipping point p_c of initiators
required to trigger them), but rather focused only on the p ≪ 1
regime. In Ref. 11, assuming locally tree-like structures, the authors
developed an asymptotic approach to approximate the size of the
cascades. This method is expected to work better for random graphs
with small average degree (with negligible presence of triads) and to
gradually break down for graphs with higher ⟨k⟩. We will comment
on its applicability in determining the tipping point p_c(φ) in the
Results section.

Details of the network structure beyond the average degree ⟨k⟩
also play an important role in the spreading process [13]. The network's
degree distribution and the presence of community structure and
local clustering can significantly affect the dynamics of spreading
and vulnerability to cascades in both social networks (driven by
influencing) [6,14,15] and infrastructure networks (driven by load-based
failures) [16].

To elucidate the effect of clustering, we study the effect of network
rewiring on the cascades triggered by different methods. Specifically,
starting from an empirical network with a community structure and
relatively high clustering, we redistribute the links in the network
while preserving the original degree sequence, using a number of
different methods. Cascades are found to be larger and more
likely in the original network which, in addition to having an inherent
community structure, has a much higher clustering coefficient
(essentially capturing the density of triads) [13]. These results indicate
that local clustering, just as in the case of a single-node (or single-clique)
initiator [6,15], facilitates the spreading of global cascades in the
case of multiple initiators as well.

A recent study [17] also considered cascades in the threshold model
in multiplex networks (a natural framework and terminology for
interdependent networks [18-20] in the social setting). In this case,
individuals can be connected by multiple types of edges (representing
multiple kinds of social ties, e.g., colleagues, friends, or family). It was
shown that multiplex networks facilitate cascades, i.e., increase the
social network's vulnerability to spreading [17].

Results 

In the threshold model, every node in the network can be in one of
two possible states, 0 (inactive) or 1 (active), which can also be
thought of as signifying distinct binary opinions on an issue. The
typical initial condition for studying threshold model dynamics is
one where all nodes except a minority (the initiators) are in state 0.
The dynamics then proceeds as follows. At each time step, a node is
selected at random. If the node is inactive, it becomes active if at least
a threshold fraction φ of its neighboring nodes are active, i.e., in state 1.
The active state is assumed to be permanent, i.e., once a node becomes
active it remains active indefinitely. The system evolves according to
these rules until no further activations can occur. The threshold can,
in general, be different for every node, but for simplicity we consider
the case where every node has the same threshold. The size of
the cascade at any point during its evolution, or after it has terminated,
is quantified by the fraction of active nodes in the network. In
the following sections we discuss the simulation of these dynamics for
various network topologies.
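
The update rule described above can be illustrated with a minimal Python sketch. It is not the authors' simulation code. The adjacency-dictionary representation, the function name, and the choice to visit nodes in randomly shuffled sweeps (rather than drawing one node per time step) are assumptions made for brevity. Because activation is permanent, the final set of active nodes does not depend on the order in which nodes are visited.

```python
import random

def run_threshold_cascade(adj, initiators, phi, rng=None):
    """Return the final fraction of active nodes (cascade size S).

    adj        -- dict mapping each node to the set of its neighbours
    initiators -- iterable of nodes that start in state 1
    phi        -- common adoption threshold shared by all nodes
    """
    rng = rng or random.Random()
    active = set(initiators)
    nodes = list(adj)
    changed = True
    while changed:                      # stop once a full sweep activates nobody
        changed = False
        rng.shuffle(nodes)              # assumption: random-order sweeps
        for v in nodes:
            if v in active or not adj[v]:
                continue                # active nodes stay active; isolated nodes never adopt
            n_active = sum(1 for u in adj[v] if u in active)
            # "at least a threshold fraction phi" of the neighbours is active
            if n_active >= phi * len(adj[v]):
                active.add(v)
                changed = True
    return len(active) / len(adj)
```

On a four-node path with a single initiator at one end, a threshold of 0.5 lets the cascade travel the whole path, while a threshold of 0.6 stops it at the initiator.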

Selection strategies. The decision that a node will adopt state 1 depends
only on the states of its neighbors. If the fraction of its neighborhood
in state 1 exceeds φ then the node updates its state. As a result of this
threshold condition a node's degree plays an important role in
determining how easily it can be influenced. The threshold
condition is more easily satisfied for a low-degree node than for a
high-degree node, since the former requires fewer active neighbors
than the latter, given a fixed adoption threshold φ for all
nodes. Similarly, the average degree of the network ⟨k⟩ determines to
what extent, if at all, the entire network can be influenced. For a fixed
number of initiators, high-degree nodes are less likely to be
influenced because more of their neighbors must already be active for
the threshold condition to be satisfied. A high ⟨k⟩ is therefore not a desirable
condition for cascades. On the other hand, for low ⟨k⟩, the network
consists of disconnected clusters of sizes less than O(N), and cascades
remain confined to one or a few of these clusters. As a result, global
cascades only become possible in an intermediate range of ⟨k⟩, the
cascade window. In general, the size of the cascade window depends on both
the threshold φ and the initiator fraction p.

Figure 1 | Cascade size S as a function of the average degree on ER networks of N = 1000 nodes with threshold φ = 0.18 for different selection strategies of
multiple initiators for (a) p = 0.01 and (b) p = 0.02. Time evolution of the average cascade size S on ER networks of N = 1000 nodes with average
degree ⟨k⟩ = 6.0 and threshold φ = 0.18 for different selection strategies of multiple initiators for (c) p = 0.01 and (d) p = 0.02.

Figure 2 | Cascade size and scaled cascade size as a function of the initiator fraction on ER networks with ⟨k⟩ = 10.0. (a) Cascade size S as a function of the initiator fraction
p for ER networks with N = 10000 for different values of φ. (b) Scaled cascade size S̃ [Eq. (1)] vs. p for ER networks with different network sizes N
and φ values.

The precise choice of initiators also plays an important role in the
size of the cascade and consequently the cascade window itself. A
strategic selection of initiators can dramatically increase the average
size of the spread, which we denote by S. Here, we compare three
heuristic strategies for selecting a set of initiators constituting a fraction
p of the total network size: (i) random selection, (ii) selecting
nodes in descending order of their degrees, and (iii) selecting nodes in
descending order of their k-shell index [21]. In (ii) and (iii), the choice of
initiators may not be unique. If several sets of initiators
can be selected for the same degree (or k-shell), one of these sets is
selected at random.
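
The three heuristics can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used for the figures. The function names, the rounding of pN to an integer (with at least one initiator), and the simple peeling routine for the k-shell index are all choices made here. Ties are broken at random by shuffling before a stable sort.

```python
import random

def core_numbers(adj):
    """k-shell index of every node, obtained by repeatedly removing a
    node of minimum remaining degree (simple O(N^2) peeling)."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=deg.get)
        k = max(k, deg[v])              # shell index never decreases while peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                deg[u] -= 1
    return core

def select_initiators(adj, p, strategy, rng=None):
    """Pick a fraction p of the nodes by 'random', 'degree' or 'kshell'."""
    rng = rng or random.Random()
    n_init = max(1, round(p * len(adj)))    # assumption: at least one initiator
    nodes = list(adj)
    rng.shuffle(nodes)                      # random order doubles as tie-breaker
    if strategy == "random":
        return set(nodes[:n_init])
    if strategy == "degree":
        score = {v: len(adj[v]) for v in adj}
    elif strategy == "kshell":
        score = core_numbers(adj)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Python's sort is stable, so equal scores keep their shuffled order
    nodes.sort(key=score.get, reverse=True)
    return set(nodes[:n_init])
```

For a triangle whose first node also carries three pendant neighbors, degree-based selection of one initiator picks that hub, whereas the three triangle nodes share the highest k-shell index.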

The simulation results for a fixed fraction of initiators p = 0.01 on
ER graphs with N = 1000 and φ = 0.18 are shown in Fig. 1(a), which
plots the average spread size as a function of the average degree ⟨k⟩.
When ⟨k⟩ is small, all
three strategies perform equally well because the network consists
only of small clusters without a giant component, and hence the spread is
localized to those clusters. As soon as ⟨k⟩ becomes large enough for a
giant component to arise, the spread covers a large portion of the
network. Further increasing ⟨k⟩ makes it harder for the nodes to
satisfy the threshold condition and S decreases again.
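
A sweep over ⟨k⟩ of the kind summarized in Fig. 1(a) could be set up along the following lines. This is a rough sketch, not the authors' procedure. The function names, the G(N, q) construction with q = ⟨k⟩/(N − 1), the frontier-based cascade routine, and the deterministic tie-breaking among equal-degree nodes are assumptions made for this example.

```python
import random

def er_graph(n, mean_degree, rng):
    """ER random graph: each pair is linked with probability <k>/(n - 1)."""
    q = mean_degree / (n - 1)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < q:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def cascade_size(adj, initiators, phi):
    """Final active fraction; a node adopts once at least a fraction phi
    of its neighbours is active. Only neighbours of newly activated
    nodes need to be re-examined."""
    active = set(initiators)
    frontier = list(active)
    while frontier:
        newly_active = []
        for a in frontier:
            for v in adj[a]:
                if v in active:
                    continue
                n_active = sum(1 for u in adj[v] if u in active)
                if n_active >= phi * len(adj[v]):
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return len(active) / len(adj)

def mean_cascade_size(n, mean_degree, p, phi, runs, seed=0):
    """Average S over independent ER graphs, seeding the highest-degree nodes."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        adj = er_graph(n, mean_degree, rng)
        # assumption: ties in degree are broken by node label, not at random
        by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
        total += cascade_size(adj, by_degree[:round(p * n)], phi)
    return total / runs
```

Evaluating mean_cascade_size for a list of mean_degree values at fixed n, p and phi traces out one curve of S versus ⟨k⟩. Two limits can be checked by hand: with no links the cascade size equals p, and on a complete graph the cascade is global only when p is at least φ.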

To understand the differences in the performance of these heuristics,
we first note that there are two distinct aspects determining the
efficacy of a node as an initiator. First, it must be capable of influencing
a large number of nodes, i.e., it should have a large degree.
Second, it must be connected to nodes which have an easily satisfiable
threshold condition, i.e., the degrees of its neighbors must be sufficiently
low. Additionally, and related to the first point, it also makes
sense to choose the highest-degree nodes as initiators, since they are
the hardest to influence. In light of these arguments, the highest-degree
selection strategy appears to be a natural choice for generating
large cascades. It would appear that high k-shell nodes are a comparably
good choice, since high k-shell nodes also possess a high
degree. However, by construction, nodes in the highest k-shells are
a special subset of the high-degree nodes that are predominantly
connected to other nodes of high degree. In other words, nodes
selected in descending order of their k-shell index have fewer easily
influenceable neighbors than nodes selected purely on the basis of
degree. This qualitatively explains why the k-shell method does not
perform as well as the high-degree selection. Finally, random
selection performs worst since it largely selects low-degree nodes,
which trigger a small number of cascades, many of which
terminate when they encounter a high-degree node.

An increase in the initiator fraction p makes the cascade window
wider by allowing cascades to occur for even higher ⟨k⟩ values, as
shown in Fig. 1(b) where p is increased to 0.02. The selection strategies
follow the same ranking in this case as well.

Simulation results indicate that the highest-degree
method also performs best (followed by the k-shell method) in terms
of the speed of the cascade. The results for p = 0.01 and p = 0.02 are
shown in Figs. 1(c) and (d), respectively.

Tipping point for multiple initiators. As discussed in the previous
section, for a small [O(1)-size] seed of initiators, cascades can only
occur if φ is smaller than a critical value (φ < 1/⟨k⟩ for sparse random
graphs [4,6,8]). However, this does not hold if we introduce a sufficiently
large fraction of initiators in the system. We look at the quantity S
(average fraction of nodes in state 1) as a function of p. (We will refer
to S as cascade size for short.) Gradually increasing p shows that
initially, when p ≪ 1, (global) cascades are not observed. When
p reaches a critical value p_c, a discontinuous transition occurs and
large cascades are seen immediately, as shown in Fig. 2(a). The need
for a minimum critical fraction of committed nodes for consensus
has been observed in different models of influence [22-24] (see
Discussion for more details).

Figure 3 | (a) Critical fraction of initiators for global cascades, p_c (obtained by simulation and
analytic approximation [11], see Supplementary Information Section S.3), as a function of the
local threshold value φ for ER
networks of size N = 5000 with various values of the average degree. The
dashed line corresponds to the exact limiting case on large complete graphs
(fully-connected networks), p_c = φ. (b) Critical fraction of initiators for
three different selection strategies for ER networks with ⟨k⟩ = 10 and N =
5000.

Figure 4 | Time evolution of the size of the cascades S on the high-school (HS) network and its randomized version obtained by x-swaps with identical
degree sequence, with N = 921, ⟨k⟩ = 5.96, and φ = 0.18, for two different values of the fraction of initiators. (a) HS friendship network and (b)
its x-swapped randomized version. (c) Direct comparison of the ensemble-averaged time series for the original HS network (red solid curve) and for
its x-swapped randomized version (green solid curve); blue solid curves represent the conditional average over runs for which the spread reaches the entire
network. Thin black curves in (a) and (b) are individual time series. The fraction of initiators for (a-c) is p = 0.01. (d) HS friendship network and (e)
its x-swapped randomized version. (f) Direct comparison of the ensemble-averaged time series for the original HS network (red solid curve) and for
its x-swapped randomized version; blue solid curves represent the conditional average over runs for which the spread reaches the entire network. Thin black
curves in (d) and (e) are individual time series. The fraction of initiators for (d-f) is p = 0.02.

Since starting with a finite p itself accounts for a large number of
nodes in state 1, the relevant quantity to look at is the number of
nodes that were initially in state 0 and eventually adopted state 1 (i.e.,
excluding the initiators). Thus, we define

$$\tilde{S} = \frac{S - p}{1 - p}, \qquad (1)$$

which measures the fraction of non-initiator nodes that participate in
the cascade. Transitions in S̃ are shown in Fig. 2(b) for different φ
values and several network sizes. It can clearly be seen that the transition
only depends upon φ and is independent of the system size N. This
transition (the emergence of the tipping point) is quite generic in the
threshold model, and can be observed in networks with different
sizes and average degrees, as well as for different selection methods
for initiators (see Supplementary Information Sections S.1 and S.2
for more details).

The critical point p_c in each case is calculated by numerically
computing the derivative of S with respect to p and finding its maximum.
Having calculated p_c allows us to look explicitly at the relationship
between p_c and φ, as shown in Fig. 3(a) for different average
degrees ⟨k⟩. As ⟨k⟩ increases, all curves appear to converge to the
limiting case of the fully-connected network (complete graph) for
which p_c = φ. Therefore, for a given threshold φ the minimum
number of initiators needed to trigger large cascades can be estimated.
We also employed a previously developed asymptotic method [11]
to estimate p_c(φ) analytically (see Supplementary Information
Section S.3 for more details). This method uses a tree approximation
for the network structure and calculates the cascade size by assuming
a progressive, directed activation of nodes from the surface of the tree
to the root. Consequently, the method works well only for low ⟨k⟩ and
low p. For large ⟨k⟩, the tree approximation breaks down, while for
large p, deviations from the assumed progressive and directed activation
of levels become significant. The comparison of the analytically
predicted p_c using this method to values obtained from simulations
clearly shows regions of approximation validity and breakdown
[Fig. 3(a)].

Figure 5 | Visualizations of spreading in the threshold model (typical
individual runs) for various networks at different times during evolution
(arrow on top indicates the direction of time evolution). N = 921, ⟨k⟩ =
5.96, p = 0.01 and φ = 0.18. Nodes in state 1 (active nodes) are colored red.
(a) Original high-school network; (b) randomized network (by x-swap)
when the eventual spread is local; and (c) the same randomized network but
for a run that reaches the whole network.

Figure 6 | Cascade size and probability of global cascades in the high-school (HS) network and its two randomized versions with identical degree
sequence (x-swapped and exact sampling [28]) with N = 921 and ⟨k⟩ = 5.96. Cascade size S as a function of φ for (a) p = 0.01 and for (b) p = 0.02.
Probability of global cascades P_c as a function of φ for (c) p = 0.01 and for (d) p = 0.02.

For fixed ⟨k⟩ = 10 and N = 5000, we also studied by simulations
how the selection of initiators affects the critical fraction p_c.
Simulation results in Fig. 3(b) show that selection of initiators by
their degree works better than the other two methods across the
range of the threshold φ.
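
The two quantities used in this section, the scaled cascade size of Eq. (1) and the critical fraction p_c located at the maximum of dS/dp, can be computed from a sampled S(p) curve as sketched below. The function names, the forward-difference estimate of the derivative, and the convention of reporting the midpoint of the steepest interval are assumptions of this sketch. The text does not specify the numerical differentiation scheme that was actually used.

```python
def scaled_cascade_size(S, p):
    """Fraction of non-initiator nodes that end up active, (S - p) / (1 - p)."""
    return (S - p) / (1.0 - p)

def estimate_critical_fraction(p_values, S_values):
    """Locate the steepest rise of S(p) with forward differences.

    p_values must be strictly increasing; the returned value is the
    midpoint of the interval with the largest slope.
    """
    best_slope, best_p = float("-inf"), None
    for i in range(len(p_values) - 1):
        dp = p_values[i + 1] - p_values[i]
        slope = (S_values[i + 1] - S_values[i]) / dp
        if slope > best_slope:
            best_slope = slope
            best_p = 0.5 * (p_values[i] + p_values[i + 1])
    return best_p
```

A finer grid in p near the jump gives a sharper estimate, since the resolution of p_c is limited by the grid spacing.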

Impact of network structure and clustering. In this section we
study how the dynamics of the threshold model is affected by
structural changes in the network. We study the dynamics on an
empirical high-school friendship network, using one particular
network from the Add Health data set (also employed in Ref. 14) and a
few degree-sequence-preserving randomized versions of it. [Add
Health was designed by J. Richard Udry, Peter S. Bearman, and
Kathleen Mullan Harris, and funded by grant P01-HD31921
from the National Institute of Child Health and Human Development,
with cooperative funding from 17 other agencies. For data files
contact Add Health, Carolina Population Center, 123 W. Franklin
Street, Chapel Hill, NC 27516-2524, addhealth@unc.edu, URL:
http://www.cpc.unc.edu/projects/addhealth/ (accessed June 20, 2013).] For
simplicity, we extract the giant component of the high-school
network, which has N = 921 nodes and ⟨k⟩ ≈ 5.96. Hereafter, we only
consider the giant component of this network and refer to it as the
high-school network. The initiator fraction is kept fixed at p = 0.01.
The network contains two communities which are roughly equal in
size. We generate two distinct ensembles of networks from this
high-school network by employing the following randomization
methods:

1. The link swap method (henceforth referred to as x-swap), in
which two links are selected at random and then one end point
of a link is swapped with the end point of the other link. An
x-swap step is disallowed if it results in fragmentation of the
network. This swapping is done repeatedly so that the network
is randomized to an extent that any community structure, local
clustering, or degree-degree correlation is eliminated [25-27].
2. The exact sampling method of Del Genio et al. (DKTB) [28], in which a
connected network is constructed from the degree sequence of
the original network. The algorithm takes as input the exact
degree sequence of the network and joins the link stubs from
different nodes until every stub has been paired with another
stub [28,29].
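
The x-swap procedure of item 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation. The function names, the rejection of swaps that would create self-loops or duplicate links, the full breadth-first connectivity check after every swap, and the max_tries cap are assumptions made here. Node labels are assumed to be mutually comparable (e.g. integers). The exact sampling method of item 2 is more involved and is not attempted.

```python
import random
from collections import deque

def is_connected(adj):
    """Breadth-first search from an arbitrary node; True if it reaches every node."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen) == len(adj)

def _rewire(adj, old_edges, new_edges):
    """Remove old_edges and add new_edges in the symmetric adjacency dict."""
    for u, v in old_edges:
        adj[u].remove(v)
        adj[v].remove(u)
    for u, v in new_edges:
        adj[u].add(v)
        adj[v].add(u)

def x_swap(adj, n_swaps, rng=None, max_tries=None):
    """Randomise adj in place with degree-preserving link swaps.

    Returns the number of accepted swaps (at most n_swaps).
    """
    rng = rng or random.Random()
    if max_tries is None:
        max_tries = 100 * n_swaps       # hypothetical cap to guarantee termination
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    accepted = tries = 0
    while accepted < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:          # pick which end points are exchanged
            c, d = d, c
        # proposed links a-d and c-b must not be self-loops or duplicates
        if len({a, b, c, d}) < 4 or d in adj[a] or b in adj[c]:
            continue
        old, new = [(a, b), (c, d)], [(a, d), (c, b)]
        _rewire(adj, old, new)
        if is_connected(adj):
            edges[i], edges[j] = new
            accepted += 1
        else:
            _rewire(adj, new, old)      # undo a swap that fragments the network
    return accepted
```

Each accepted swap leaves every node's degree unchanged, since each of the four end points loses one link and gains one.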

Both methods of randomization leave the degree sequence
unchanged. (Results for x-swap and exact sampling [28] are very
similar and we only show them in detail for the former.) We look
at the size of the spread S as a function of time for p = 0.01 in the original
high-school network [Fig. 4(a)] and the x-swapped high-school network
[Fig. 4(b)], while Fig. 4(c) shows the direct comparison between
the corresponding ensemble-averaged time series. Analogous plots
for p = 0.02 are shown in Figs. 4(d-f).

Figure 7 | Average cascade size as a function of time in the high-school
network (HS), in its two randomized versions with identical degree
sequence (x-swapped and exact sampling [28]), and in ER networks with the
same average degree. N = 921, ⟨k⟩ = 5.96, p = 0.01 and φ = 0.18 in all
cases.

Figure 8 | Cascade size and scaled cascade size as a function of the initiator fraction on the high-school network (N = 921, ⟨k⟩ = 5.96). (a) Cascade size S as a
function of the initiator fraction p for different values of φ. (b) Scaled cascade size S̃ [Eq. (1)] vs. p for different values of φ.

For the empirical high-school
network, some runs reveal the existence of community structure in
the network, with the spread being faster in one community than in the
other. More specifically, in some of these runs, the cascade first
sweeps one of the communities (while the other one resists) before
it becomes global. This can be seen in the step-like evolution of the
corresponding time series in Fig. 4(a) [randomized networks do not
exhibit this behavior, see Fig. 4(b)]. The same phenomenon can also
be observed in the configuration snapshots in Fig. 5(a), while the
randomized counterparts do not show this behavior [Fig. 5(b,c)]. In
general, the results show that triggered cascades are larger and more
likely for a network with high local clustering than for a randomized
network with the same degree sequence [Fig. 4], although the impact
of clustering diminishes for larger values of p. Note that the
clustering coefficient of the original high-school (HS) graph is C_HS
≈ 0.125, while for its randomized versions obtained by x-swaps (XS) and
exact-degree-sequence (DKTB [28]) construction it is C_XS ≈ C_DKTB ≈
0.008 (see Supplementary Information Section S.4 for more details).
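
For readers who wish to compare clustering before and after randomization, the sketch below computes an average local clustering coefficient. The text does not state which definition of the clustering coefficient underlies the quoted values, so the choice of the node-averaged local coefficient (with nodes of degree below two contributing zero) is an assumption, as is the function name.

```python
from itertools import combinations

def average_clustering(adj):
    """Mean over nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue                    # fewer than two neighbours: contributes 0
        links = sum(1 for u, w in combinations(nbrs, 2) if w in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)
```

A triangle gives a value of 1, a three-node path gives 0, and a triangle with one pendant node attached gives 7/12.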

The average cascade size S [Fig. 6(a) and (b)] and the probability of
global cascades P_c [Fig. 6(c) and (d)] as a function of the threshold φ also
indicate that strong clustering (present in empirical networks)
facilitates threshold-limited spreading. (We define a global cascade
as a cascade that covers at least 60% of the network size N.)
Hence, this important feature of threshold-limited spreading [6,15] is
preserved for the case of multiple initiators studied here.
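
Given the final cascade sizes from an ensemble of runs, the two summary statistics discussed above follow directly. The sketch below is illustrative only. The function name and the default argument are hypothetical, with the 60% cutoff taken from the definition of a global cascade given in the text.

```python
def cascade_statistics(final_sizes, global_fraction=0.6):
    """Return (average cascade size, probability of a global cascade).

    final_sizes     -- final active fractions S, one per simulation run
    global_fraction -- a run counts as global if S is at least this value
    """
    n_runs = len(final_sizes)
    mean_size = sum(final_sizes) / n_runs
    n_global = sum(1 for s in final_sizes if s >= global_fraction)
    return mean_size, n_global / n_runs
```

When some runs reach nearly the whole network and others remain small, the average cascade size lies between the two outcomes and is not representative of any single run, which is why the probability of a global cascade is reported alongside it.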

The temporal evolution of the average cascade size in the original
HS network, its two randomized versions, and an ER network of the
same size and with the same average degree is shown in Fig. 7. The
two methods of randomization (x-swap and exact sampling) give
roughly the same cascade size S. In the case of randomized networks,
the spread reaches the full network for some realizations [Fig. 5(c)] and
is minuscule for others [Fig. 5(b)], and therefore the average S < 1.

Finally, analogous to Fig. 2, we show the emergence of global
cascades (at the tipping point p_c) in the high-school network as
the density of initiators is varied [Fig. 8].

Discussion 

Several recent studies have addressed, for a variety of agent-based
opinion spreading models, the impact of a special set of initiators, viz.
inflexible individuals [22], also referred to synonymously as committed
[14,23,24,30-34] or stubborn [35] agents, true believers [36], zealots [37-39], or
inflexible contrarians [40,41]. The rules of state updating (or opinion
switching) in these models are symmetric, and governed purely by
the local density of states in the neighborhood of a node. In such a
system, the inflexible nodes constitute a special set of nodes which
never change their opinion, thereby breaking the symmetry of the
system and giving rise to tipping points beyond which the entire
network conforms to the state adopted by the committed agents. It
has been shown that the emergence of tipping points in some of these
models is related to metastable regions and barriers (saddle points) in
the corresponding opinion landscapes [23,30,31]. Because these models
allow frequent changes of state or opinion at the individual level,
they are more suitable for scenarios where switching an
individual's state incurs virtually no cost.

In contrast to the above models, the threshold model (or the
qualitatively similar threshold contact process [42-45]) is more suited
to modeling the diffusion of innovations or adoption of new products
where investment in a new idea comes at a cost, and the incentive to
switch back after becoming active is low. Here, spreading is an asymmetric
process and is also inhibited by a local threshold: individuals
can only adopt the new product or norm if a sufficient fraction of
their neighbors have already done so. (The threshold model or
threshold contact process, in spirit, is closer to the family of
Susceptible-Infected-Susceptible- or contact-process-like models
[21,46-51], in that the spreading of a disease or norm is an inherently
asymmetric process by the rules of the local dynamics.)

The focus of this work was to identify tipping points for global
cascades triggered by multiple initiators and governed by local
thresholds. Our findings demonstrate that these tipping points
emerge in both ER and empirical high-school networks in a qualitatively
similar fashion.

Further, we studied three different heuristic strategies to select a
fraction of initiators for the threshold model on ER networks as well as
on an empirical network. Our results demonstrate that selecting
initiators by their degree (highest first) results in the largest (as well
as fastest) spread. Naturally, for high values of the local threshold (φ
> 1/⟨k⟩), single initiators or small cliques cannot trigger global cascades.
We showed by simulations that there exists a critical value of the
initiator fraction p_c that is needed to trigger cascades for high values
of φ. We also studied how structural changes, such as randomizing an
empirical network using different randomization methods,
affect the size of the cascades triggered (in the cases studied here)
by multiple initiators. Our simulation results on the empirical high-school
network show that randomizing the network in fact results in
narrower cascade windows compared to the original network with
strong clustering, implying that clustering facilitates spreading in
threshold-limited diffusion with multiple initiators.